Evaluation of a speech-driven telephone information service using the PARADISE framework: a closer look at subjective measures
Abstract
• Many evaluations of spoken dialogue systems focus on defining and collecting quantitative measures.
• Other studies concentrate on users' perceptions only.
• Only a few studies describe quality ratings in terms of both system properties and user judgements.
• We used the PARADISE framework (i) to gain insight into the factors affecting user satisfaction with a spoken dialogue system, and (ii) to enable a quantitative description of user satisfaction in terms of objective measurements.

2. Evaluation of the Information Service "Irene"
• 63 participants called the service (explicit confirmation) in Oct-Nov 2002.
• 905 user utterances were recorded. The following data were collected:
- total duration, task duration
- number of system, user and task turns
- number of recognition errors
- number of completed tasks
• Participants judged the following system/dialogue features on a 5-point scale:
- TTS performance
- ASR performance
- task ease
- interaction pace
- user expertise
- expected behavior
- future use
- grade (new measure, 10-point scale)

3. Analysis of the Measures
• Table 1. Mean results for the objective measures
• Table 2. Mean results for the subjective measures
• User satisfaction is the sum of the subjective measures.
• We applied multivariate linear regression to the data, with user satisfaction as the dependent variable and the objective measures as the independent variables.
• Results of stepwise linear regression are given both including the new variable 'grade' (Table 3) and, following Walker's definition, without 'grade' (Table 4).
• The subjective and objective Task Success measures contribute to user satisfaction to a similar extent.
• The description of the data is more accurate with the extended definition of user satisfaction (including 'grade').
• We can now derive a performance function from Table 3; in it, N is a normalization function that allows the weights to be independent of the scales.
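The regression step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' analysis: it uses plain least squares rather than the stepwise procedure mentioned in the text, it assumes N is the Z-score normalization used in Walker's original PARADISE formulation, and the function names and the toy numbers are hypothetical placeholders rather than data from the study.

```python
import numpy as np


def z_normalize(x):
    """Z-score normalization N(x): zero mean and unit standard deviation per column."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0)


def fit_performance_weights(objective, satisfaction):
    """Least-squares weights for: N(satisfaction) ~ sum_i w_i * N(objective_i).

    objective:    (n_dialogues, n_measures) array of objective measures
    satisfaction: (n_dialogues,) array, e.g. the sum of the subjective ratings
    """
    X = z_normalize(objective)
    y = z_normalize(satisfaction)
    # No intercept is needed because both sides are centered by N.
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights


if __name__ == "__main__":
    # Synthetic placeholder data (NOT from the study): columns could stand for
    # dialogue duration in seconds and number of recognition errors.
    objective = np.array([
        [110.0, 1.0],
        [150.0, 4.0],
        [95.0, 0.0],
        [180.0, 3.0],
        [130.0, 2.0],
    ])
    satisfaction = np.array([30.0, 21.0, 33.0, 22.0, 27.0])
    print(fit_performance_weights(objective, satisfaction))
```

Because every measure is normalized before fitting, the resulting weights can be compared with each other even though the raw measures use different units and scales.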
• By including 'grade' in the definition of user satisfaction, the description of the data becomes more accurate.
• 'Grade' may reflect user experiences that cannot be expressed by the other subjective measures but do contribute to user satisfaction.
• Principal components analysis of the subjective measures:
- ASR performance, task ease, expected behavior, future use, and grade are described by dimension 1
- TTS performance and user expertise are described by dimension 2
- interaction pace does not fit in dimensions 1 and 2
• Future research: improving the definition of user satisfaction or 'quality of service' by
- including new measures,
- fine-tuning existing measures,
- changing the relative weights of the measures.
N dialogue dura …
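A principal components analysis of the subjective ratings, as mentioned in the bullets above, could be sketched like this. The sketch assumes the ratings are standardized first (which seems sensible since 'grade' uses a 10-point scale and the other measures a 5-point scale) and computes the components with a singular value decomposition; the function name `pca` and the random ratings are hypothetical and do not reproduce the study's data or results.

```python
import numpy as np


def pca(ratings, n_components=2):
    """Return (loadings, explained_variance_ratio) for the leading components.

    ratings: (n_participants, n_measures) array of subjective scores.
    Each row of `loadings` holds one component's weight on every measure.
    """
    ratings = np.asarray(ratings, dtype=float)
    # Standardize each measure so that differing rating scales do not dominate.
    z = (ratings - ratings.mean(axis=0)) / ratings.std(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return vt[:n_components], explained[:n_components]


if __name__ == "__main__":
    # Synthetic placeholder ratings (NOT from the study): 20 participants, 4 measures.
    rng = np.random.default_rng(0)
    ratings = rng.integers(1, 6, size=(20, 4))
    loadings, explained = pca(ratings)
    print(loadings)
    print(explained)
```

Measures with large loadings of the same sign on one component are the ones that component "describes"; a measure with small loadings on both leading components would correspond to the situation reported for interaction pace.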
Publication date: 2003